-
Notifications
You must be signed in to change notification settings - Fork 15.4k
[GlobalISel] Fix [SU]SUBO combine #116998
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
The carry-out can participate in several positions: * the condition of a select * the condition of a conditional branch * the carry-in of addes and subes (narrowScalarAddSub) ADCS: Add with carry and set flags Add with Carry, setting flags, adds two register values and the Carry flag value, and writes the result to the destination register. It updates the condition flags based on the result. Note that the carry-in participates in the addition. It must be a 0 or 1 value. The carry-in/out values must be 0 or 1 for the addos, subos, addes, and subos. Including, but limited to the lowering for G_USUBO is buggy: `MIRBuilder.buildICmp(CmpInst::ICMP_ULT, BorrowOut, LHS, RHS);` The carry-out is target dependent and can lead to surprising and non-deterministic results when used as carry-in. narrowScalarAddSub can build chains of addes/subes. When the ops are not negal for the current target and get lowered. They become chains of adds/subs and icmps. The result will be come target dependent, which is incorrect. Lowering should write selects between one and zero constants into carry-out registers. AARrch64: ZeroOrOneBooleanContent AMDGPU: ZeroOrNegativeOneBooleanContent and ZeroOrOneBooleanContent RISCV: ZeroOrOneBooleanContent
|
@llvm/pr-subscribers-backend-aarch64 Author: Thorsten Schütt (tschuett) ChangesThe carry-out can participate in several positions:
ADCS: Add with carry and set flags Add with Carry, setting flags, adds two register values and the Carry flag value, and writes the result to the destination register. It updates the condition flags based on the result. Note that the carry-in participates in the addition. It must be a 0 or 1 value. The carry-in/out values must be 0 or 1 for the addos, subos, addes, and subos. Including, but limited to the lowering for G_USUBO is buggy: The carry-out is target dependent and can lead to surprising and non-deterministic results when used as carry-in. narrowScalarAddSub can build chains of addes/subes. When the ops are not negal for the current target and get lowered. They become chains of adds/subs and icmps. The result will be come target dependent, which is incorrect. Lowering should write selects between one and zero constants into carry-out registers. AARrch64: ZeroOrOneBooleanContent Full diff: https://github.com/llvm/llvm-project/pull/116998.diff 2 Files Affected:
diff --git a/llvm/lib/CodeGen/GlobalISel/CombinerHelper.cpp b/llvm/lib/CodeGen/GlobalISel/CombinerHelper.cpp
index d95fc8cfbcf558..027e4ee7159b41 100644
--- a/llvm/lib/CodeGen/GlobalISel/CombinerHelper.cpp
+++ b/llvm/lib/CodeGen/GlobalISel/CombinerHelper.cpp
@@ -7830,9 +7830,7 @@ bool CombinerHelper::matchSuboCarryOut(const MachineInstr &MI,
case ConstantRange::OverflowResult::AlwaysOverflowsHigh: {
MatchInfo = [=](MachineIRBuilder &B) {
B.buildSub(Dst, LHS, RHS);
- B.buildConstant(Carry, getICmpTrueVal(getTargetLowering(),
- /*isVector=*/CarryTy.isVector(),
- /*isFP=*/false));
+ B.buildConstant(Carry, 1);
};
return true;
}
@@ -7855,9 +7853,7 @@ bool CombinerHelper::matchSuboCarryOut(const MachineInstr &MI,
case ConstantRange::OverflowResult::AlwaysOverflowsHigh: {
MatchInfo = [=](MachineIRBuilder &B) {
B.buildSub(Dst, LHS, RHS);
- B.buildConstant(Carry, getICmpTrueVal(getTargetLowering(),
- /*isVector=*/CarryTy.isVector(),
- /*isFP=*/false));
+ B.buildConstant(Carry, 1);
};
return true;
}
diff --git a/llvm/test/CodeGen/AArch64/GlobalISel/combine-overflow.mir b/llvm/test/CodeGen/AArch64/GlobalISel/combine-overflow.mir
index 20cba54923548e..8f24ee76f82b32 100644
--- a/llvm/test/CodeGen/AArch64/GlobalISel/combine-overflow.mir
+++ b/llvm/test/CodeGen/AArch64/GlobalISel/combine-overflow.mir
@@ -277,3 +277,26 @@ body: |
$q1 = COPY %o_wide
RET_ReallyLR implicit $w0
...
+---
+name: usub_may_carry_non_const
+body: |
+ bb.0:
+ liveins: $w0, $w1
+ ; CHECK-LABEL: name: usub_may_carry_non_const
+ ; CHECK: liveins: $w0, $w1
+ ; CHECK-NEXT: {{ $}}
+ ; CHECK-NEXT: [[COPY:%[0-9]+]]:_(s32) = COPY $w0
+ ; CHECK-NEXT: [[COPY1:%[0-9]+]]:_(s32) = COPY $w0
+ ; CHECK-NEXT: %sub:_(s32), %o:_(s1) = G_USUBO [[COPY]], [[COPY1]]
+ ; CHECK-NEXT: %o_wide:_(s32) = G_ZEXT %o(s1)
+ ; CHECK-NEXT: $w0 = COPY %sub(s32)
+ ; CHECK-NEXT: $w1 = COPY %o_wide(s32)
+ ; CHECK-NEXT: RET_ReallyLR implicit $w0
+ %0:_(s32) = COPY $w0
+ %1:_(s32) = COPY $w0
+ %sub:_(s32), %o:_(s1) = G_USUBO %0, %1
+ %o_wide:_(s32) = G_ZEXT %o(s1)
+ $w0 = COPY %sub(s32)
+ $w1 = COPY %o_wide
+ RET_ReallyLR implicit $w0
+...
|
|
@llvm/pr-subscribers-llvm-globalisel Author: Thorsten Schütt (tschuett) ChangesThe carry-out can participate in several positions:
ADCS: Add with carry and set flags Add with Carry, setting flags, adds two register values and the Carry flag value, and writes the result to the destination register. It updates the condition flags based on the result. Note that the carry-in participates in the addition. It must be a 0 or 1 value. The carry-in/out values must be 0 or 1 for the addos, subos, addes, and subos. Including, but limited to the lowering for G_USUBO is buggy: The carry-out is target dependent and can lead to surprising and non-deterministic results when used as carry-in. narrowScalarAddSub can build chains of addes/subes. When the ops are not negal for the current target and get lowered. They become chains of adds/subs and icmps. The result will be come target dependent, which is incorrect. Lowering should write selects between one and zero constants into carry-out registers. AARrch64: ZeroOrOneBooleanContent Full diff: https://github.com/llvm/llvm-project/pull/116998.diff 2 Files Affected:
diff --git a/llvm/lib/CodeGen/GlobalISel/CombinerHelper.cpp b/llvm/lib/CodeGen/GlobalISel/CombinerHelper.cpp
index d95fc8cfbcf558..027e4ee7159b41 100644
--- a/llvm/lib/CodeGen/GlobalISel/CombinerHelper.cpp
+++ b/llvm/lib/CodeGen/GlobalISel/CombinerHelper.cpp
@@ -7830,9 +7830,7 @@ bool CombinerHelper::matchSuboCarryOut(const MachineInstr &MI,
case ConstantRange::OverflowResult::AlwaysOverflowsHigh: {
MatchInfo = [=](MachineIRBuilder &B) {
B.buildSub(Dst, LHS, RHS);
- B.buildConstant(Carry, getICmpTrueVal(getTargetLowering(),
- /*isVector=*/CarryTy.isVector(),
- /*isFP=*/false));
+ B.buildConstant(Carry, 1);
};
return true;
}
@@ -7855,9 +7853,7 @@ bool CombinerHelper::matchSuboCarryOut(const MachineInstr &MI,
case ConstantRange::OverflowResult::AlwaysOverflowsHigh: {
MatchInfo = [=](MachineIRBuilder &B) {
B.buildSub(Dst, LHS, RHS);
- B.buildConstant(Carry, getICmpTrueVal(getTargetLowering(),
- /*isVector=*/CarryTy.isVector(),
- /*isFP=*/false));
+ B.buildConstant(Carry, 1);
};
return true;
}
diff --git a/llvm/test/CodeGen/AArch64/GlobalISel/combine-overflow.mir b/llvm/test/CodeGen/AArch64/GlobalISel/combine-overflow.mir
index 20cba54923548e..8f24ee76f82b32 100644
--- a/llvm/test/CodeGen/AArch64/GlobalISel/combine-overflow.mir
+++ b/llvm/test/CodeGen/AArch64/GlobalISel/combine-overflow.mir
@@ -277,3 +277,26 @@ body: |
$q1 = COPY %o_wide
RET_ReallyLR implicit $w0
...
+---
+name: usub_may_carry_non_const
+body: |
+ bb.0:
+ liveins: $w0, $w1
+ ; CHECK-LABEL: name: usub_may_carry_non_const
+ ; CHECK: liveins: $w0, $w1
+ ; CHECK-NEXT: {{ $}}
+ ; CHECK-NEXT: [[COPY:%[0-9]+]]:_(s32) = COPY $w0
+ ; CHECK-NEXT: [[COPY1:%[0-9]+]]:_(s32) = COPY $w0
+ ; CHECK-NEXT: %sub:_(s32), %o:_(s1) = G_USUBO [[COPY]], [[COPY1]]
+ ; CHECK-NEXT: %o_wide:_(s32) = G_ZEXT %o(s1)
+ ; CHECK-NEXT: $w0 = COPY %sub(s32)
+ ; CHECK-NEXT: $w1 = COPY %o_wide(s32)
+ ; CHECK-NEXT: RET_ReallyLR implicit $w0
+ %0:_(s32) = COPY $w0
+ %1:_(s32) = COPY $w0
+ %sub:_(s32), %o:_(s1) = G_USUBO %0, %1
+ %o_wide:_(s32) = G_ZEXT %o(s1)
+ $w0 = COPY %sub(s32)
+ $w1 = COPY %o_wide
+ RET_ReallyLR implicit $w0
+...
|
Can you elaborate? The carry out should be a boolean value that conforms to the target boolean content. It is not a free form value that you can just add to something.
No? It's a boolean. Its value shouldn't be directly interpreted as an addend? |
The lowering of G_UADDE blindly adds the CarryIn to the sum of
|
|
The adde combine also writes a 1 into the carry-out:
|
|
Our documentation describes the carry-in and -out as s1. It cannot be a target dependent infinity: |
I think that just mirrors what the langref says for overflow intrinsics, which basically say that output is a bit indicating a carry. The carry bit is IMO a boolean which we need to legalize into whatever representation the target needs for the boolean. Am I misunderstanding something here? |
|
Thanks!
We end up with a target specific boolean, but there are some inconsistencies in the code base. |
This is a bug. We would need an optimization combine to fold the carry-out to add-in iff the target boolean content is ZeroOrOne |
This could use elaboration about the target boolean contents |
The carry-out can participate in several positions:
ADCS: Add with carry and set flags
Add with Carry, setting flags, adds two register values and the Carry flag value, and writes the result to the destination register. It updates the condition flags based on the result.
Note that the carry-in participates in the addition. It must be a 0 or 1 value. The carry-in/out values must be 0 or 1 for the addos, subos, addes, and subos.
Including, but not limited to the lowering for G_USUBO is buggy:
MIRBuilder.buildICmp(CmpInst::ICMP_ULT, BorrowOut, LHS, RHS);The carry-out is target dependent and can lead to surprising and non-deterministic results when used as carry-in.
narrowScalarAddSub can build chains of addes/subes. When the ops are not legal for the current target and get lowered. They become chains of adds/subs and icmps. The result will become target dependent, which is incorrect.
Lowering should write selects between one and zero constants into carry-out registers.
AARrch64: ZeroOrOneBooleanContent
AMDGPU: ZeroOrNegativeOneBooleanContent
and ZeroOrOneBooleanContent
RISCV: ZeroOrOneBooleanContent